Abbreviation Disambiguation: Experiments with Various Variants of the One Sense per Discourse Hypothesis

Authors

  • Yaakov HaCohen-Kerner
  • Ariel Kass
  • Ariel Peretz
Abstract

Abbreviations are widely used in many languages, and disambiguating them is critical. This research presents a structured process that attempts to solve the problem of abbreviation ambiguity. Various baseline methods have been explored, including context-related methods and statistical methods. Almost all of the methods are domain-independent and language-independent. The application domain is Jewish Law documents written in Hebrew, which are known to be rich in ambiguous abbreviations. Several implementations of the one sense per discourse hypothesis are used, improving the baseline methods with new variants. Several common machine learning methods have been tested to find a successful integration of the baseline method variants. The best results were achieved by LIBSVM, with 96.09% accuracy.
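As a rough illustration of the one sense per discourse idea mentioned in the abstract, the sketch below relabels every occurrence of an abbreviation within a document with the sense predicted most often for it in that document. This is only a generic majority-vote reading of the hypothesis, not the authors' variants, which the abstract does not describe. The function name, the tuple layout, and the placeholder labels are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def apply_one_sense_per_discourse(predictions):
    """Relabel each occurrence with its document-level majority sense.

    `predictions` is a list of (doc_id, abbreviation, predicted_sense)
    tuples, one per occurrence (a hypothetical layout for this sketch).
    """
    # Count how often each sense was predicted per (document, abbreviation).
    votes = defaultdict(Counter)
    for doc_id, abbreviation, sense in predictions:
        votes[(doc_id, abbreviation)][sense] += 1

    # Pick the most frequent sense for each (document, abbreviation) pair.
    majority = {
        key: counter.most_common(1)[0][0] for key, counter in votes.items()
    }

    # Return the occurrences in their original order with the majority sense.
    return [
        (doc_id, abbreviation, majority[(doc_id, abbreviation)])
        for doc_id, abbreviation, _ in predictions
    ]


if __name__ == "__main__":
    # Placeholder data: "AA" is a made-up abbreviation with two made-up senses.
    sample = [
        ("doc1", "AA", "sense_1"),
        ("doc1", "AA", "sense_2"),
        ("doc1", "AA", "sense_1"),
        ("doc2", "AA", "sense_2"),
    ]
    for row in apply_one_sense_per_discourse(sample):
        print(row)
```

In this sketch the vote is taken over the output of some baseline predictor, so a single outlying prediction inside a document is overridden by the document-level majority, while other documents are left unaffected.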

Similar resources

Combined One Sense Disambiguation of Abbreviations

A process that attempts to solve abbreviation ambiguity is presented. Various context-related features and statistical features have been explored. Almost all features are domain-independent and language-independent. The application domain is Jewish Law documents written in Hebrew. Such documents are known to be rich in ambiguous abbreviations. Various implementations of the one sense per discou...

Automatic Resolution of Ambiguous Abbreviations in Biomedical Texts using Support Vector Machines and One Sense Per Discourse Hypothesis

We present an algorithm to disambiguate abbreviations in Medline abstracts using Support Vector Machines (SVM) and one sense per discourse hypothesis. In contrast to other work using SVM for natural language disambiguation which always depend on handcrafted training and testing data, the algorithm provided here automatically extracts the training and testing data through searching long form of ...

"One Entity per Discourse" and "One Entity per Collocation" Improve Named-Entity Disambiguation

The “one sense per discourse” (OSPD) and “one sense per collocation” (OSPC) hypotheses have been very influential in Word Sense Disambiguation. The goal of this paper is twofold: (i) to explore whether these hypotheses hold for entities, that is, whether several mentions in the same discourse (or the same collocation) tend to refer to the same entity or not, and (ii) test their impact in Named-...

Improving Japanese Zero Pronoun Resolution by Global Word Sense Disambiguation

This paper proposes unsupervised word sense disambiguation based on automatically constructed case frames and its incorporation into our zero pronoun resolution system. The word sense disambiguation is applied to verbs and nouns. We consider that case frames define verb senses and semantic features in a thesaurus define noun senses, respectively, and perform sense disambiguation by selecting th...

Disambiguating Temporal–Contrastive Discourse Connectives for Machine Translation

Temporal–contrastive discourse connectives (although, while, since, etc.) signal various types of relations between clauses such as temporal, contrast, concession and cause. They are often ambiguous and therefore difficult to translate from one language to another. We discuss several new and translation-oriented experiments for the disambiguation of a specific subset of discourse connectives in...



Publication date: 2008